Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.
Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.
- Automatically Mitigating Vulnerabilities in Binary Programs via Partially Recompilable Decompilation. PRD lifts suspect binary functions to source, making them available for analysis, revision, or review, and creates a patched binary using source- and binary-level techniques. Although decompilation and recompilation do not typically succeed on an entire binary, our approach does because it is limited to a few functions, such as those identified by our binary fault localization. Free, publicly-accessible full text available May 1, 2026. (A rough sketch of patching and recompiling a single lifted function appears after this list.)
- Machine learning (ML) pervades the field of Automated Program Repair (APR). Algorithms deploy neural machine translation and large language models (LLMs) to generate software patches, among other tasks. But there are important differences between these applications of ML and earlier work, which complicates the task of ensuring that results are valid and likely to generalize. A challenge is that the most popular APR evaluation benchmarks were not designed with ML techniques in mind. This is especially true for LLMs, whose large and often poorly disclosed training datasets may include problems on which they are evaluated. This article reviews work in APR published in the field's top five venues since 2018, emphasizing emerging trends, including the dramatic rise of ML models, especially LLMs. ML-based articles are categorized along structural and functional dimensions, and a variety of issues raised by these new methods are identified. Importantly, data leakage and contamination concerns arise from the challenge of validating ML-based APR using existing benchmarks, which were designed before these techniques were popular. We discuss inconsistencies in evaluation design and performance reporting and offer pointers to solutions where they are available. Finally, we highlight promising new directions that the field is already taking. Free, publicly-accessible full text available March 22, 2026. (A crude token-overlap contamination check is sketched after this list.)
- Free, publicly-accessible full text available January 24, 2026.
- GPUs are used in many settings to accelerate large-scale scientific computation, including simulation, computational biology, and molecular dynamics. However, optimizing codes to run efficiently on GPUs requires developers to have both a detailed understanding of the application logic and significant knowledge of parallel programming and GPU architectures. This paper shows that an automated GPU program optimization tool, GEVO, can leverage evolutionary computation to find code edits that reduce the runtime of three important applications (multiple sequence alignment, agent-based simulation, and molecular dynamics codes) by 28.9%, 29%, and 17.8%, respectively. The paper presents an in-depth analysis of the discovered optimizations, revealing that (1) several of the most important optimizations involve significant epistasis, (2) the primary sources of improvement are application-specific, and (3) many of the optimizations generalize across GPU architectures. In general, the discovered optimizations are not straightforward even for a human GPU expert, showcasing the potential of automated program optimization tools both to reduce the optimization burden for human domain experts and to provide new insights for GPU experts. Free, publicly-accessible full text available December 31, 2025. (The shape of such an evolutionary mutate/measure/select loop is sketched after this list.)
- How do complex adaptive systems, such as life, emerge from simple constituent parts? In the 1990s, Walter Fontana and Leo Buss proposed a novel modeling approach to this question, based on a formal model of computation known as the λ calculus. The model demonstrated how simple rules, embedded in a combinatorially large space of possibilities, could yield complex, dynamically stable organizations reminiscent of biochemical reaction networks. Here, we revisit this classic model, called AlChemy, which has been understudied over the past 30 years. We reproduce the original results and study their robustness using the greater computing resources available today. Our analysis reveals several unanticipated features of the system, demonstrating a surprising mix of dynamical robustness and fragility. Specifically, we find that complex, stable organizations emerge more frequently than previously expected, that these organizations are robust against collapse into trivial fixed points, but that they cannot easily be combined into higher-order entities. We also study the role played by the random generators used in the model, characterizing the initial distribution of objects produced by two random expression generators and their consequences for the results. Finally, we provide a constructive proof showing how an extension of the model, based on the typed λ calculus, could simulate transitions between arbitrary states in any possible chemical reaction network, indicating a concrete connection between AlChemy and chemical reaction networks. We conclude with a discussion of possible applications of AlChemy to self-organization in modern programming languages and quantitative approaches to the origin of life. (A minimal λ-calculus reaction 'soup' in this spirit is sketched after this list.)
- The rapidly expanding use of wastewater for public health surveillance requires new strategies to protect privacy rights, while data are collected at increasingly discrete geospatial scales, i.e., city, neighborhood, campus, and building level. Data collected at high geospatial resolution can inform on labile, short-lived biomarkers, thereby making wastewater-derived data both more actionable and more likely to cause privacy concerns and stigmatization of subpopulations. Additionally, data sharing restrictions among neighboring cities and communities can complicate efforts to balance public health protections with citizens' privacy. Here, we have created an encrypted framework that facilitates the sharing of sensitive population health data among entities that lack trust for one another (e.g., between adjacent municipalities with different governance of health monitoring and data sharing). We demonstrate the utility of this approach with two real-world cases. Our results show the feasibility of sharing encrypted data between two municipalities and a laboratory while performing secure private computations for wastewater-based epidemiology (WBE) with high precision, fast speeds, and low data costs. This framework is amenable to other computations used by WBE researchers, including population-normalized mass loads, fecal indicator normalizations, and quality control measures. The Centers for Disease Control and Prevention's National Wastewater Surveillance System shows ~8% of records attributed to collection before the wastewater treatment plant, illustrating an opportunity to further expand currently limited community-level sampling and public health surveillance through security and responsible data sharing as outlined here. (A toy secret-sharing aggregation illustrating this privacy goal is sketched after this list.)
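
The PRD entry above turns on recompiling only a handful of lifted functions rather than the whole binary. The sketch below is not the PRD toolchain; it is a minimal illustration of that one step, assuming a local gcc: a hypothetical decompiled C function is patched as text and recompiled on its own into a relocatable object file that a later (omitted) step would splice back into the original binary. The function, the bug, and the patch are invented for illustration.

```python
"""Minimal sketch of function-level repair and recompilation.

This is NOT the PRD toolchain from the paper above; it only illustrates the
idea of patching and recompiling a single lifted function. The decompiled
source, the bug, and the patch are hypothetical placeholders.
"""
import subprocess
import tempfile
from pathlib import Path

# Hypothetical decompiler output for one suspect function (stand-in text).
DECOMPILED_FUNCTION = """
#include <string.h>

void copy_name(char *dst, const char *src) {
    strcpy(dst, src); /* suspected overflow */
}
"""

# Hypothetical repair: bound the copy, assuming a 64-byte destination buffer.
PATCHED_FUNCTION = DECOMPILED_FUNCTION.replace(
    "strcpy(dst, src); /* suspected overflow */",
    "strncpy(dst, src, 63); dst[63] = '\\0'; /* bounded copy */",
)


def recompile_function(source_text: str, out_dir: Path) -> Path:
    """Compile a single lifted function to a relocatable object file."""
    c_file = out_dir / "suspect_function.c"
    obj_file = out_dir / "suspect_function.o"
    c_file.write_text(source_text)
    # -c: stop after producing the object file; a later (omitted) step would
    # splice this object back into the original binary.
    subprocess.run(["gcc", "-c", "-O2", str(c_file), "-o", str(obj_file)],
                   check=True)
    return obj_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        obj = recompile_function(PATCHED_FUNCTION, Path(tmp))
        print(f"recompiled patched function -> {obj}")
```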
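
The APR survey above highlights data leakage between LLM training sets and evaluation benchmarks. A crude screen that practitioners sometimes use (not a method proposed by the article) is token-shingle overlap between a benchmark fix and retrievable training text; the sketch below computes Jaccard similarity over 5-token shingles on two invented Java-like snippets.

```python
"""Crude contamination screen: n-gram shingle overlap between a benchmark
patch and a candidate training-corpus snippet. Both snippets are invented;
a real screen would scan retrievable training data at scale."""
import re


def shingles(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of n-token shingles of a code snippet."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity; 1.0 means identical shingle sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


BENCHMARK_PATCH = "if (index < 0 || index >= size) throw new IndexOutOfBoundsException();"
CORPUS_SNIPPET = "if (index < 0 || index >= size) { throw new IndexOutOfBoundsException(); }"

score = jaccard(shingles(BENCHMARK_PATCH), shingles(CORPUS_SNIPPET))
print(f"shingle overlap: {score:.2f}")  # high overlap suggests possible leakage
```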
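
GEVO's search, as summarized above, is an evolutionary loop: mutate candidates, measure runtime, keep the fastest. The sketch below shows only that loop shape on a stand-in problem; the tuned parameters (a block size and an unroll factor) and the synthetic runtime model are assumptions, whereas GEVO itself mutates compiler-level program representations and times real GPU kernels.

```python
"""Shape of an evolutionary mutate/evaluate/select loop for finding faster
configurations. The 'runtime model' is synthetic and the tuned parameters
are stand-ins, not anything from the paper above."""
import random

random.seed(0)


def simulated_runtime(block_size: int, unroll: int) -> float:
    """Synthetic stand-in for timing a kernel launch (lower is better)."""
    return abs(block_size - 256) / 64 + abs(unroll - 4) + random.uniform(0.0, 0.2)


def mutate(cfg):
    """Perturb one of the two tuned parameters within its legal range."""
    block_size, unroll = cfg
    if random.random() < 0.5:
        block_size = max(32, min(1024, block_size + random.choice([-32, 32])))
    else:
        unroll = max(1, min(16, unroll + random.choice([-1, 1])))
    return (block_size, unroll)


def evolve(pop_size=16, generations=40):
    population = [(random.randrange(1, 33) * 32, random.randrange(1, 17))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=lambda c: simulated_runtime(*c))
        survivors = scored[: pop_size // 2]  # truncation selection
        children = [mutate(random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda c: simulated_runtime(*c))


best = evolve()
print(f"best config found: block_size={best[0]}, unroll={best[1]}")
```

Truncation selection keeps the sketch short; a real tool would also need crossover, validation of program correctness after each edit, and repeated timing runs to control measurement noise.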
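
In AlChemy, as originally described by Fontana and Buss (details recalled here rather than taken from the abstract above), a 'collision' applies one λ-expression to another and normalizes the result, which then rejoins a fixed-size population. The sketch below is a minimal version of that dynamic using De Bruijn indices, with a step budget and size bound to cut off divergent reductions; the population size, expression depth, and budgets are arbitrary choices, not the paper's settings.

```python
"""Minimal AlChemy-style soup: collide two random closed lambda-terms by
application, normalize with a step budget, and feed normal forms back into a
constant-size population. De Bruijn indices avoid variable capture."""
import random

random.seed(1)

# Terms: ('var', i) | ('lam', body) | ('app', fn, arg)


def shift(t, d, cutoff=0):
    """Shift free variable indices >= cutoff by d."""
    if t[0] == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if t[0] == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))


def subst(t, j, s):
    """Substitute s for variable j in t."""
    if t[0] == 'var':
        return s if t[1] == j else t
    if t[0] == 'lam':
        return ('lam', subst(t[1], j + 1, shift(s, 1)))
    return ('app', subst(t[1], j, s), subst(t[2], j, s))


def step(t):
    """One leftmost-outermost beta step, or None if t is in normal form."""
    if t[0] == 'app':
        fn, arg = t[1], t[2]
        if fn[0] == 'lam':
            return shift(subst(fn[1], 0, shift(arg, 1)), -1)
        r = step(fn)
        if r is not None:
            return ('app', r, arg)
        r = step(arg)
        if r is not None:
            return ('app', fn, r)
        return None
    if t[0] == 'lam':
        r = step(t[1])
        return ('lam', r) if r is not None else None
    return None


def term_size(t):
    if t[0] == 'var':
        return 1
    if t[0] == 'lam':
        return 1 + term_size(t[1])
    return 1 + term_size(t[1]) + term_size(t[2])


def normalize(t, budget=100, max_size=200):
    """Reduce to normal form, or None if the budget or size bound is hit."""
    for _ in range(budget):
        if term_size(t) > max_size:
            return None
        r = step(t)
        if r is None:
            return t
        t = r
    return None


def random_term(depth, binders=0):
    """Random closed term: variable indices always point at an enclosing lambda."""
    if depth == 0:
        return ('var', random.randrange(binders)) if binders else ('lam', ('var', 0))
    r = random.random()
    if r < 0.35:
        return ('lam', random_term(depth - 1, binders + 1))
    if r < 0.75 or binders == 0:
        return ('app', random_term(depth - 1, binders), random_term(depth - 1, binders))
    return ('var', random.randrange(binders))


def run(pop_size=50, collisions=2000):
    soup = []
    while len(soup) < pop_size:
        t = normalize(random_term(4))
        if t is not None:
            soup.append(t)
    for _ in range(collisions):
        product = normalize(('app', random.choice(soup), random.choice(soup)))
        if product is not None:  # non-normalizing collisions are discarded
            soup[random.randrange(pop_size)] = product
    print(f"distinct expressions after {collisions} collisions: {len(set(soup))}")


if __name__ == "__main__":
    run()
```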
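
The wastewater entry above is about computing shared statistics without exposing any one municipality's raw data. The paper describes an encrypted framework; the sketch below instead uses additive secret sharing over a prime field, a related but different privacy technique, to show how two municipalities and a laboratory could learn only an aggregate viral load. The measurements and modulus are made up.

```python
"""Toy additive secret sharing: two municipalities and a lab jointly compute
an aggregate without revealing individual measurements. This is an
illustrative stand-in, not the encrypted framework from the paper."""
import secrets

PRIME = 2**61 - 1  # field modulus; inputs are integer-scaled gene copies/L


def share(value: int, n_parties: int = 3):
    """Split value into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    return sum(shares) % PRIME


# Each municipality secret-shares its (scaled) wastewater measurement.
measurement_a = 125_000  # hypothetical copies/L, city A
measurement_b = 98_500   # hypothetical copies/L, city B
shares_a = share(measurement_a)
shares_b = share(measurement_b)

# Each party (city A, city B, lab) holds one share of each input and adds its
# shares locally; no party ever sees the other city's raw measurement.
local_sums = [(sa + sb) % PRIME for sa, sb in zip(shares_a, shares_b)]

# Combining the local sums reveals only the aggregate.
total = reconstruct(local_sums)
print(f"aggregate load: {total} (expected {measurement_a + measurement_b})")
```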